As one of the near-term goals of the President’s Vision for Space Exploration, establishment of a multi-person
lunar base will require high-endurance power systems which are independent of the sun, and can operate without 
replenishment for several years. These requirements may be met using nuclear power systems specifically
designed for use on the lunar surface. While it is envisioned that such a system will generally be supervised by 
humans, some of the evolutions required may be semi- or fully autonomous. The entire base complement for
near-term missions may be fewer than 10 individuals, most or all of whom may not be qualified nuclear plant operators and
may be off-base for extended periods; thus the need for autonomous power system operation. Startup, shutdown,
and load following operations will require the application of advanced control and health management strategies 
with an emphasis on robust, supervisory, coordinated control of, for example, the nuclear heat source, energy 
conversion plant (e.g., Brayton Energy Conversion units), and power management system. Autonomous operation 
implies that, in addition to being capable of automatic response to disturbance input or load changes, the system is 
also capable of assessing the status of the integrated plant, determining the risk associated with the possible actions, 
and making a decision as to the action that optimizes system performance while minimizing risk to the mission. 
Adapting the control to deviations from design conditions and degradation due to component failures will be 
essential to ensure base inhabitant safety and mission success. Intelligent decisions will have to be made to choose 
the right set of sensors to provide the data needed for condition monitoring and fault detection and isolation;
because of liftoff weight and space limitations, it will not be possible to have an extensive set of instruments as used
for earth-based systems.

Advanced instrumentation and control technologies will be needed to enable this critical functionality of 
autonomous operation. It will be imperative to consider instrumentation and control requirements in parallel to 
system “configuration” development so as to identify control-related, as well as integrated system-related, problem 
areas early to avoid potentially expensive “work-arounds”. This paper presents an overview of the enabling 
technologies necessary for the development of reliable, autonomous lunar base nuclear power systems with an 
emphasis on system architectures and off-the-shelf algorithms rather than hardware. Autonomy needs are presented 
in the context of a hypothetical lunar base nuclear power system. The scenarios and applications presented are 
hypothetical in nature, based on information from open-literature sources, and only intended to provoke thought and 
provide motivation for the use of autonomous, intelligent control and diagnostics. 

I. Introduction 

To meet the goals of the President’s vision for exploration of our solar system, several key enabling technologies 
need to be matured (ref. 1). Among them is the need for high-endurance lunar base surface power systems which
are independent of the sun and can operate without replenishment for several years (ref. 2). Additionally, the total
base complement for near-term missions may be fewer than 10 individuals, most or all of whom may not be qualified
nuclear plant operators and may be off-base for extended periods, requiring that the power system operate
autonomously (ref. 3). NASA and the Department of Energy are currently teaming to provide nuclear power sources 
to fulfill this need. Mission objectives and extended durations warrant the development of nuclear power sources,
many of those in the trade space being novel designs which have never been built or operated on earth. While each
of the individual components may have its own well-known dynamic operating characteristics (at the present
scales and state of the art), the characteristics of the integrated system with the component designs at the scale 
required by the proposed lunar missions are at present unknown. Also, feedback controllers designed for the 
individual components (again, at the current designs and sizes) may require significant redesign after integration of
the component as part of the power system, not necessarily for nominal operation but for off-nominal scenarios that
may originate in other parts of the system. This suggests that an autonomous “supervisor,” capable of rapidly 
assessing the state of each component and its effect on the performance of the overall plant, be employed. This 
function has, for the most part, been performed by employing the human-in-the-loop supervisory control paradigm. 
While terrestrial nuclear power plants and naval nuclear propulsion plants require a relatively small number of 
humans for operations (e.g., 10 to 40 on shift at a given time), truly autonomous plant operation has yet to be 
demonstrated for a system of this complexity with the proposed operational requirements. Thus, the characteristic
essential for the success of these missions is the ability of the power plant to make decisions and take actions on its
own, in other words to operate autonomously, for evolutions such as startup, power down/up, load following, and
defense-in-depth malfunction recovery.

This paper presents an overview of the enabling technologies necessary for the development of reliable, 
autonomous lunar base nuclear power systems, with an emphasis on system architectures and off-the-shelf
algorithms rather than hardware. It is acknowledged that each topic presented represents a significant research effort 
in its own right, with the ultimate control and diagnostic solutions dependent on the final design. Emphasis is placed 
on the proposed near-term missions (through the year 2030) which are not intended to be basic research endeavors 
from the lunar base power system perspective, with milestones and schedules that need to be met with engineering 
solutions to the challenges that arise. 

There are scores of potential control and diagnostic solutions available. However, it is the opinion of the
authors that a prudent down-selection needs to be performed concurrent with the design of the base power plant, 
rather than in the final stages of design where sensing or control requirements may not be achievable with the 
assigned sensor locations or computational capabilities. This paper is intended to provide a conceptual basis for such 
a down-selection. Autonomy needs are presented in the context of a hypothetical lunar base nuclear power system 
and may be extended to significantly different nuclear applications. The scenarios and applications presented are 
hypothetical in nature, based on information from open-literature sources, and are intended only to provoke thought 
and provide motivation for the use of autonomous, intelligent control and diagnostics in lunar base applications. 

II. A Hypothetical Lunar Base Nuclear Power System 

There are several nuclear power options in the trade-space for lunar base support in the 21st century. Some of the 
more common system features include a nuclear heat source, an energy conversion unit used to convert heat 
produced in the nuclear heat source to electrical power, and a power management and distribution (PMAD) system 
used to route power from the energy conversion system to the lunar base electrical loads. For the present study, a 
hypothetical power system configuration (fig. 1) consists of a nuclear heat source (NHS), redundant Brayton cycle- 
based power conversion systems (PCS), waste heat radiators (WHR), power management and distribution (PMAD) 
systems, parasitic load resistors (PLR), and lunar base electrical loads. As shown in figure 1 via the bidirectional 
arrows, it is anticipated that there will be dynamic interaction between those systems that have a direct interface, e.g.,
the NHS-Brayton unit interface, as well as indirect effects that propagate through several different subsystems and 
affect the component of interest. The exact nature and magnitude of these interactions is, of course, dependent on the 
final design of the components. For the hypothetical power system provided, PCS load changes result in varying 
amounts of heat transferred from the nuclear heat source primary system to the Brayton working fluid (e.g.,
Helium-Xenon) through an intermediate heat exchanger. The NHS, employing a negative temperature feedback
characteristic, would naturally follow Brayton unit load transients. However, normal operation has the NHS (and 
Brayton unit) power constant over the majority of the mission, with load transients accommodated by diverting 
excess power to the PLR and radiating the heat produced into space. This load accommodating technique has been 
successfully demonstrated during ion thruster recycle transients at NASA Glenn Research Center using a 2 kW 
Brayton turbo alternator supplying thruster power via a PMAD (ref. 4). As mentioned previously, normal operation 
(as well as anticipated, high-probability component malfunctions accommodated via redundancy) does not
necessarily justify the need for autonomous, intelligent, and supervisory control. The need for autonomy arises 
when the power system encounters off-normal situations that may result in a variety of outcomes, depending on the 
actions taken by the supervisor, that require a decision to be made in real-time. 

Figure 1. Hypothetical lunar base nuclear power system.
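
The load accommodation scheme described above, in which generator output is held constant and excess power is diverted to the PLR, can be illustrated with a short sketch. The function name `plr_dispatch`, its parameters, and the numbers are hypothetical and are not taken from the demonstration in ref. 4; the sketch assumes only a constant generator output and a PLR of finite capacity.

```python
def plr_dispatch(generated_kw, base_load_kw, plr_capacity_kw):
    """Return (plr_kw, unabsorbed_kw) for one instant in time.

    Generator output is held constant; whatever the base does not
    consume is sent to the parasitic load resistor (PLR), up to the
    PLR capacity. All values are hypothetical illustration inputs.
    """
    if min(generated_kw, base_load_kw, plr_capacity_kw) < 0:
        raise ValueError("power values must be non-negative")
    excess_kw = max(generated_kw - base_load_kw, 0.0)
    plr_kw = min(excess_kw, plr_capacity_kw)
    return plr_kw, excess_kw - plr_kw


# Nominal case: the PLR absorbs the full 40 kW of excess power.
print(plr_dispatch(100.0, 60.0, 50.0))
# Large load drop: 30 kW of excess exceeds the PLR capacity.
print(plr_dispatch(100.0, 20.0, 50.0))
```

The second return value is the power imbalance that the PLR cannot absorb, which is the quantity a supervisor would have to handle by other means.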

III. Motivating Scenarios for Autonomous Lunar Base Nuclear Power Systems 

Consider as a motivating example the hypothetical scenario where the switching circuitry that directs power flow 
to the PLR suffers a malfunction. A large, rapid loss of load occurs and excess power that would normally be 
radiated out into space via the PLR gets diverted to a second, redundant PLR. The situation deteriorates further with 
the redundant PLR’s circuitry also failing due to some common mode event, causing the Brayton Unit’s alternator to 
overspeed and consequently trip off line, resulting in significantly less heat being transferred from the NHS coolant 
to the Brayton’s working fluid in the intermediate heat exchanger. The temperature of the coolant flowing back to 
the NHS rises, with the inherent negative temperature feedback effect driving the NHS power down. Since the 
Brayton unit has tripped off line, power that would normally be supplied to the NHS coolant pumps and control 
drive mechanisms is no longer available, resulting in a system shutdown. During the initial stages of this loss of load
transient, an autonomous system’s supervisor could anticipate this possible fault scenario and decide to initiate an
alternate control strategy to minimize the risk to the human inhabitants and the base, e.g., performing a limited
runback of the entire system, thus preventing a turbo alternator trip due to a double PLR loss. The extent of the
runback would be determined by the evolution of the malfunction, i.e., as the situation deteriorates the likelihood of a
system failure increases, resulting in more drastic measures being taken on the part of the supervisor. Thus, the
autonomous system maintains power to critical systems and also provides a more reliable system in general via a
defense-in-depth strategy.
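
One way to picture a runback whose extent depends on how far the malfunction has progressed is sketched below. This is an assumption-laden illustration, not a proposed control law: the function `runback_setpoint`, the derating `margin`, and the plant numbers are all hypothetical. The supervisor lowers the power setpoint just far enough that the excess over the base load still fits within the PLR capacity that remains healthy.

```python
def runback_setpoint(rated_kw, base_load_kw, healthy_plr_kw, margin=0.1):
    """Return the generator power setpoint after a possible runback.

    The setpoint is the largest value, up to rated power, for which the
    excess over the base load still fits in the healthy PLR capacity
    derated by `margin` (a hypothetical safety factor).
    """
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    usable_plr_kw = healthy_plr_kw * (1.0 - margin)
    return min(rated_kw, base_load_kw + usable_plr_kw)


# Hypothetical plant: 100 kW rated, 40 kW base load, two 40 kW PLRs.
# Healthy PLR capacity falls from 80 kW to 40 kW to 0 kW as faults accumulate.
for healthy_plr_kw in (80.0, 40.0, 0.0):
    print(healthy_plr_kw, runback_setpoint(100.0, 40.0, healthy_plr_kw))
```

With both PLRs healthy no runback is needed; each additional PLR loss forces a deeper runback, which mirrors the escalation described in the scenario.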

The above is an example of functional redundancy (the autonomous control system deciding to initiate a system 
runback) providing support to hardware-based block redundancy (the two PLRs). The Cassini-Huygens mission 
system architecture (ref. 6) provides an example of a similar defense-in-depth strategy. For terrestrial nuclear 
systems, functional redundancy is often provided by plant operators with plant shutdown a realistic (albeit 
undesirable) alternative, with backup power for customers readily available from the power grid. For lunar base 
applications no such backup power source may be available. Additionally, the entire sequence of events described 
could occur over a period shorter than the information processing and response time of the human plant supervisors, 
illustrating the essential need for autonomy in systems where fault conditions have the potential to deteriorate 
rapidly. 

NASA/TM-2005-213839

Other relatively routine operating evolutions would also benefit from an autonomous supervisory control system
with varying degrees of intelligence. For operation involving automated up-power from extended low power periods
and load following transients, a candidate controller algorithm would be optimized feedforward-robust feedback
control. This technique uses optimized actuator commands computed in advance (possibly using a genetic 
algorithm-based optimization scheme), with robust feedback controllers to accommodate system uncertainties and 
real-world effects such as actuator saturation (it is prudent to assume that due to the aggressive schedules presented, 
a “perfect” integrated system model will most likely not exist at the time of base construction). If a shutdown of the 
entire system is implemented, the subsequent startup sequence is crucial not only from the perspective of human 
safety (i.e., maintaining life support), but also from the perspective of operability, e.g., the NHS and Brayton system
components require the appropriate time to heat-soak their internals prior to initiating the next phase of startup. A
potential concern regarding NHS startup involves brittle fracture of the NHS vessel; heat-up and pressurization must
be coordinated in order to maintain vessel integrity. The power plant supervisor might employ Multiple Input
Multiple Output (MIMO)-type subsystem controllers to effectively coordinate the operation of multiple Brayton
units, loads, etc., while accommodating uncertainties in the models used to synthesize the controllers. In addition,
supervisory algorithms could be developed to address issues such as optimal load distribution during transients.
Once again, due to the interaction of several subsystems, each with its own dynamic (feedback) characteristics,
precise coordination of the integrated system is essential. Autonomy during well-planned, perfectly understood
operating scenarios may still be thought of as unnecessary. However, to accommodate unanticipated events, a
supervisor, continuously monitoring the progress of each subsystem, additionally determines alternate minimum-risk 
control strategies to use in the event of off-normal occurrences or malfunctions. 
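
The feedforward plus feedback idea described above can be sketched on a toy plant. Everything in this example is an assumption made for illustration: the first-order plant, the deliberate mismatch between `plant_gain` and `model_gain`, the PI gains, and the actuator limits. Simple model inversion stands in for the optimized (e.g., genetic algorithm-based) feedforward commands, and a PI loop with conditional integration stands in for a robust feedback design.

```python
def clamp(value, low, high):
    """Limit value to the interval [low, high]."""
    return max(low, min(high, value))


def track(setpoints, plant_gain=0.8, model_gain=1.0, tau=5.0, dt=0.1,
          kp=2.0, ki=1.0, u_min=0.0, u_max=2.0):
    """Track a setpoint sequence with feedforward plus PI feedback.

    The plant is tau * dy/dt = plant_gain * u - y, integrated with
    explicit Euler. The feedforward term uses model_gain, which is
    intentionally different from plant_gain (an imperfect model).
    """
    y, integral, outputs = 0.0, 0.0, []
    for r in setpoints:
        u_ff = r / model_gain              # command computed from the model
        error = r - y
        u_unsat = u_ff + kp * error + ki * integral
        u = clamp(u_unsat, u_min, u_max)   # actuator saturation
        if u == u_unsat:                   # anti-windup: integrate only when unsaturated
            integral += error * dt
        y += dt * (plant_gain * u - y) / tau
        outputs.append(y)
    return outputs


step = [1.0] * 2000                        # 200 s of a unit setpoint
print("feedforward only:", track(step, kp=0.0, ki=0.0)[-1])
print("with feedback:   ", track(step)[-1])
```

With feedback disabled the model error leaves a steady-state offset, while the feedback loop removes it, which is the role the text assigns to the feedback controllers.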

IV. A Supervisory Control Architecture for Autonomous Lunar Base Nuclear Power Systems

The exploration of space has associated with it many sources of uncertainty, e.g., communication delays, system
degradation, and rare (but possible) events such as meteoroid impact. These need to be accommodated by a 
supervisory control structure that integrates controls, diagnostics and decision making ability for the entire system. 
Autonomous operation can be achieved by applying a high-level supervisory architecture, such as the hierarchical 
discrete event supervisory (DES) control structure presented in Yasar et al. (ref. 8). This type of architecture would 
enable different degrees of system autonomy at various hierarchical levels, while giving human plant supervisors 
overall system control and override capabilities. 

External commands from off-base human plant supervisors are communicated directly to the top level DES, the 
Integrated Power System Supervisor (IPSS). The commands then propagate through the system via the DES 
hierarchy. Each hierarchical level performs the commands and controls the levels below it. For example, figure 2 
presents an architecture with the various subsystems of the integrated power plant, each with their own component 
supervisor, control, and diagnostic systems, interacting with an integrated power system supervisor. Highlighted in 
the figure is the Brayton power conversion system interacting with its own DES, or component supervisor. The IPSS
directs the component supervisors to carry out the actions needed to meet the highest-level commands, typically
from the off-base human plant supervisor. This hierarchical architecture results in a low level of autonomy for the
highest hierarchical level, the IPSS, and a high level of autonomy for the lowest hierarchical level, the component
supervisors, thereby giving subsystem and component level control to the designer. As additional hierarchical levels
are added, the component supervisors will have more autonomy, because they will become further removed from 
direct IPSS command. In certain situations, mission planners may need direct control of the component, which can 
be achieved by designing a DES override. 
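
A minimal sketch of the command flow and override described above is given here. It is not the DES formulation of ref. 8; the class names, mode names, events, and transition table are hypothetical, and each component supervisor is reduced to a three-state finite-state machine for brevity.

```python
class ComponentSupervisor:
    """Lowest-level supervisor modeled as a small finite-state machine."""

    # (current mode, event) -> next mode; hypothetical transition table.
    TRANSITIONS = {
        ("NOMINAL", "fault"): "DEGRADED",
        ("DEGRADED", "fault"): "SAFE_SHUTDOWN",
        ("DEGRADED", "recovered"): "NOMINAL",
    }

    def __init__(self, name):
        self.name = name
        self.mode = "NOMINAL"
        self.overridden = False

    def handle(self, event):
        """Apply one event; unknown events leave the mode unchanged."""
        if not self.overridden:
            self.mode = self.TRANSITIONS.get((self.mode, event), self.mode)
        return self.mode

    def override(self, mode):
        """Direct control: pin the mode until release() is called."""
        self.mode = mode
        self.overridden = True

    def release(self):
        """Return the component to autonomous operation."""
        self.overridden = False


class IntegratedPowerSystemSupervisor:
    """Top-level supervisor that relays events down the hierarchy."""

    def __init__(self, components):
        self.components = {c.name: c for c in components}

    def dispatch(self, event, targets=None):
        """Send an event to the named components (default: all of them)."""
        names = list(self.components) if targets is None else targets
        return {name: self.components[name].handle(event) for name in names}


ipss = IntegratedPowerSystemSupervisor(
    [ComponentSupervisor("brayton"), ComponentSupervisor("nhs")])
print(ipss.dispatch("fault", ["brayton"]))
ipss.components["nhs"].override("SAFE_SHUTDOWN")
print(ipss.dispatch("fault"))
```

While a component is overridden it ignores events from the hierarchy, which corresponds to mission planners taking direct control of that component.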

Using this architecture, Yasar et al. (ref. 8) have shown an increased mission success rate during off-nominal 
flight of aircraft. The adaptation of hierarchical DES to space nuclear systems would involve identifying 
uncertainties and accommodating them in the DES design. Figure 3 presents an overview of the diagnostics and 
prognostics portion of the component supervisor. Success of the hierarchical strategy presented is heavily dependent 
on the effectiveness of the diagnostics and prognostics system. Optimal sensor selection and placement, which is 
accomplished early in the design of each subsystem, is essential for successful diagnostics and prognostics. 

Figure 2. An example supervisory control architecture for an autonomous lunar base nuclear power system.

Figure 3. Component supervisor on-base diagnostics module.

V. General Needs for Autonomous System Operation 

The needs for autonomous system operation originate in well-defined mission goals and objectives. For the 
purposes of this study, the power system mission objective is to provide safe, continuous, maintenance-free 
operation for 15 years while accommodating all necessary evolutions and faults (un-hypothesized as well as 
hypothesized). An example autonomous system architecture was presented in section IV. This section discusses the 
areas that need to be addressed in order to provide this level of autonomy for the lunar base power system presented 
in section II. Implicit is the need for appropriate placement of physical sensors, analytic or virtual sensor 
capabilities, methods to minimize system status uncertainty via, for example, sensor fusion techniques, the ability to 
extend the useful life of the components via intelligent control, diagnostic techniques and health management, and 
the development tools (i.e., the infrastructure) necessary to provide rapid-prototyping capabilities for subsequent 
down-select. The needs for autonomous system operation may be broadly divided into the following areas that 
define the scope and implementation of the autonomy desired: Guidelines for High-level Autonomy, Functional
Description of an Autonomous Control System, Faults and Failure Modes, Hazards Analysis, Metrics for an 
Autonomous Control System, and Software Quality Assurance and V&V. Each of these areas is described below. 

Guidelines for High-Level Autonomy. 

• The autonomous control system will enable the power system to meet the operational power requirements 
for all mission phases, during nominal and off-nominal conditions. 

• The autonomous control system will identify, mitigate and resolve system faults. 

• The autonomous control system will identify system degradations and use recovery actions, if needed, to
maintain mission objectives.

• The autonomous control system will document design intent, and any changes. 

• The autonomous control system will allow for on-base software upgrades. 

• The autonomous control system will be verified and validated. 

Functional Description of an Autonomous Control System.

An autonomous control system can be viewed as
doing three things: observing, processing, and managing. Each of these can be further reduced to the following:

Observing: 

• Monitoring: observations from sensors and calculated health measures.

• Recording: data, commands, status bits, reports, and log files.

Processing:

• Tracking: using a mental map (models/rules) of expected nominal system behavior.

• Condition Assessment: detecting threshold crossings or mode changes.

• Inferencing: reasoning about likely causes for anomalous observations.

Managing:

• Executing: interacting with the vehicle to control it.

• Planning: whether to continue on course or to replan under failure conditions.
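
As a small illustration of the condition assessment function listed above, the following sketch reports where a monitored signal crosses a limit in the upward direction. The function name, the sample data, and the limit are hypothetical; a practical implementation would likely add hysteresis or persistence checks to reject noise.

```python
def threshold_crossings(samples, limit):
    """Return indices where the signal rises from at-or-below limit to above it."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] <= limit < samples[i]]


# Hypothetical temperature trace (arbitrary units) checked against a limit of 4.
trace = [1, 2, 5, 6, 3, 7]
print(threshold_crossings(trace, 4))
```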

Wood et al. (ref. 9) suggest the three basic building blocks of an autonomous control system are control 
algorithms, diagnostic algorithms, and decision algorithms. The various inputs to and outputs from these functional 
blocks are commands, observations, status readings, and decision feedback. In addition they also recommend a 
hierarchical framework for autonomous control of the NHS with interfaces to other systems. The layers in this 
hierarchy reflect the functional layers of the physical system. As a result, the design of the controller can be
understood and inspected more easily, which leads to a safer design. Autonomous control involves the complete path from sensing
the value of a particular measurement to the execution of a particular action that enables optimal performance. In 
general, techniques in classification, inference, projection, and decision making can be applied to areas such as 
diagnosis and prognosis. How to balance the partition between diagnosis, prognosis and “designing-out” failures is 
still an emerging art for space applications. 

Diagnosis can be performed using inductive learning such as decision trees, case-based reasoning, rule-based or 
model-based approaches, explanation-based learning, genetic algorithms for search and optimization, neural 
networks, fuzzy learning, and soft computing techniques. The selection of specific diagnostic approaches has to be 
guided by their respective strengths and limitations, and by their suitability for the particular control architecture 
being considered. Choosing the right control architecture is an important step in the top-down design process from 
requirements to software implementation (from abstract function to specific implementation). 

Faults and Failure Modes.

The design space consists of what the autonomous control system is required to do
(capability requirements), and the constraints within which it must operate, such as plant constraints, domain 
knowledge constraints, and actuator constraints. Within this space, we can identify what fault types are required to 
be covered. Faults can be broadly categorized as: 

• When they occur: during the initial phase, during the regular operation phase or end-of-mission phase (there 
is usually a “bathtub curve” for probability of fault occurrence versus time). 

• How long they last: whether permanent, temporary, or intermittent. 

• How they manifest: whether discrete or continuous (e.g., leak), abrupt or gradual. 

• How they relate to other faults: whether independent, correlated, cascading, or simultaneous faults with a 
common cause. 

• What causes them: inherent in design (structural or functional), due to uncertainties in operating envelope, or 
due to external conditions. 

• What they affect: the component, the subsystem or the system (or all three). 

• How critical they are: non-critical, recoverable or mission-critical.
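
The categories above map naturally onto a record type that a diagnostic system could use to catalog hypothesized faults. The sketch below is one possible encoding; the class names, field names, and example entries are hypothetical and serve only to show the structure.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    INITIAL = "initial"
    REGULAR = "regular operation"
    END_OF_MISSION = "end of mission"


class Persistence(Enum):
    PERMANENT = "permanent"
    TEMPORARY = "temporary"
    INTERMITTENT = "intermittent"


class Criticality(Enum):
    NON_CRITICAL = 1
    RECOVERABLE = 2
    MISSION_CRITICAL = 3


@dataclass(frozen=True)
class FaultRecord:
    name: str
    phase: Phase                 # when the fault occurs
    persistence: Persistence     # how long it lasts
    abrupt: bool                 # how it manifests (abrupt versus gradual)
    scope: str                   # "component", "subsystem", or "system"
    cause: str                   # "design", "operating envelope", or "external"
    criticality: Criticality
    related_to: tuple = ()       # names of correlated or cascading faults


def most_critical_first(faults):
    """Order fault records from mission-critical down to non-critical."""
    return sorted(faults, key=lambda f: f.criticality.value, reverse=True)


# Hypothetical catalog entries for illustration only.
catalog = [
    FaultRecord("sensor drift", Phase.REGULAR, Persistence.PERMANENT, False,
                "component", "design", Criticality.NON_CRITICAL),
    FaultRecord("double PLR loss", Phase.REGULAR, Persistence.PERMANENT, True,
                "system", "external", Criticality.MISSION_CRITICAL,
                related_to=("PLR switch failure",)),
]
print([f.name for f in most_critical_first(catalog)])
```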

Fault knowledge involves fault models, component failure rates, and fault/symptom associations (context- 
dependent knowledge). Expected failure rates are usually outlined in a Failure Modes and Effects Analysis (FMEA) 
(ref. 10). Other domain knowledge comes from schematics, block diagrams, instrumentation list (sensor type, 
location, number, redundancy, and sampling rates), telemetry list and the operations timeline. There may also be 
information from fault analyses: fault trees, event trees, and probabilistic risk assessment.

An estimate of the density and coverage of faults is useful to have early in the design phase. Sensor selection and
placement can be guided by such models of fault coverage. Testability analysis tools provide good estimates of fault 
coverage. 
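
To make the link between fault coverage and sensor selection concrete, the sketch below applies a greedy set-cover heuristic: at each step it adds the sensor that detects the most faults not yet covered. The heuristic is a stand-in chosen for brevity, not a method taken from the cited work or from any testability analysis tool, and the sensor names and fault sets are hypothetical.

```python
def greedy_sensor_selection(detects, budget):
    """Pick up to `budget` sensors, each time adding the largest coverage gain.

    detects maps a sensor name to the set of faults it can detect.
    Returns (chosen sensors, fraction of detectable faults covered).
    """
    all_faults = set().union(*detects.values())
    chosen, covered = [], set()
    while len(chosen) < budget and covered != all_faults:
        # sorted() makes tie-breaking deterministic (first name in sort order).
        best = max(sorted(detects), key=lambda s: len(detects[s] - covered))
        chosen.append(best)
        covered |= detects[best]
    coverage = len(covered) / len(all_faults) if all_faults else 1.0
    return chosen, coverage


# Hypothetical fault signatures for four candidate sensors.
detects = {
    "T_in": {"pump_fail", "hx_foul"},
    "T_out": {"hx_foul"},
    "speed": {"overspeed", "pump_fail"},
    "vib": {"bearing"},
}
print(greedy_sensor_selection(detects, budget=2))
```

Greedy selection is not guaranteed to be optimal, but it gives a quick estimate of how coverage grows with the number of sensors allowed by the weight budget.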

Hazards Analysis.

It must be noted that failures are different from hazards (ref. 11). Hazards are conditions
that can lead to an accident, even when no component has failed. Hazard Level is a combination of severity and 
likelihood of occurrence of the hazard. For lunar missions some likely hazards are: 

• Contamination/corrosion: chemical disassociation of material

• Radiation: nuclear, electromagnetic, ionizing, thermal/infrared, ultraviolet

• Temperature extremes: high, low, or rapid changes

• Impact/collision: meteoroids

There are emerging techniques in Design under Uncertainty that can address the controller’s sensitivity to random 
environmental inputs. 
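
The statement that hazard level combines severity and likelihood can be illustrated with a simple risk matrix. The 1 to 4 scales, the use of a product, and the band boundaries below are assumptions made for this sketch; an actual program would take them from its own hazard analysis standard.

```python
def hazard_level(severity, likelihood):
    """Classify a hazard from integer severity and likelihood ratings (1 to 4).

    The product rule and the band boundaries are hypothetical choices.
    """
    if not (1 <= severity <= 4 and 1 <= likelihood <= 4):
        raise ValueError("ratings must be between 1 and 4")
    score = severity * likelihood
    if score >= 9:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


# Hypothetical ratings for two of the hazards listed above.
print("meteoroid impact:", hazard_level(severity=4, likelihood=1))
print("temperature extremes:", hazard_level(severity=3, likelihood=3))
```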

Metrics for an Autonomous Control System.

A benchmark is a standard task, representative of problems that
will occur frequently in real domains. A testbed is an environment to test standard tasks. It enables this by providing 
(hardware and software) tools for data collection, for external parameter control, and for scenario generation. 

Since autonomous control of an NHS-based power system for lunar base applications is a relatively new field,
there are no established benchmarks or testbeds, so relevant metrics may initially be hard to assess. However, based on
experience in other diagnostic systems, some suggested metrics are: 

• Accuracy: false/missed alerts, robustness to real-world effects (noise).

• Speed: hard real-time critical versus near real-time.

• Coverage: multiple/intermittent faults, subsystem-level versus system-level.

• Cost: time to develop, integrate and test, additional hardware requirements.

• Ease of Engineering: modular, reusable, scalable, understandable.
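
The accuracy metric above can be computed from a labeled test run as a false alert rate and a missed alert rate. The sketch assumes that ground truth is available for each sample, as it would be in a simulation-based testbed; the function name and the sample data are hypothetical.

```python
def alert_rates(truth, alerts):
    """Return (false alert rate, missed alert rate).

    truth[i] is True when a fault was actually present at sample i;
    alerts[i] is True when the diagnostic system raised an alert.
    """
    if len(truth) != len(alerts):
        raise ValueError("truth and alerts must have the same length")
    false_alerts = sum(1 for t, a in zip(truth, alerts) if a and not t)
    missed_alerts = sum(1 for t, a in zip(truth, alerts) if t and not a)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives
    false_rate = false_alerts / negatives if negatives else 0.0
    missed_rate = missed_alerts / positives if positives else 0.0
    return false_rate, missed_rate


# Hypothetical six-sample run with one false alert and one missed alert.
truth = [False, False, True, True, False, True]
alerts = [False, True, True, False, False, True]
print(alert_rates(truth, alerts))
```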

Software Quality Assurance and V&V.

The software implemented for power system control will be a safety-critical,
real-time system. It must satisfy explicit and very tight response time constraints. In addition, it must
integrate with external software for the sharing of information. To implement such high performance software, 
appropriate software engineering practices must be followed. This encompasses all phases of software design from 
requirements definition through release and maintenance and includes definition of executable specifications, 
reliable coding practices and code generation, performance analysis, and verification and validation of the system. 

To facilitate software debugging and upgrades, a modular (object oriented) architecture should be employed. 

VI. Modeling Needs for Autonomous Control 

Modeling needs for an autonomous system are heavily dependent on the ultimate use of the model. Initially, the 
modeling and simulation tools that are employed to support the development of a new design or application focus 
primarily on scaling up (or down) similar existing systems and emphasize system configuration, nominal steady 
state performance, and minimum payload launch weight. While these models are usually sufficient to predict steady- 
state behavior, they are not adequate for analyzing the system at the level of fidelity required for diagnostic and 
prognostic applications and fault accommodating control. Dynamic analysis of these systems is usually delayed until 
the designs are set and components built. Ideally, initial dynamic model-based analyses should be part of the trade- 
space studies to ensure that the component operates satisfactorily under steady-state and transient conditions as part 
of the integrated plant. The models used for these initial studies may (depending on the technology investigated) be 
low-fidelity, with parameter values bounded by extremes to provide worst and best-case transient performance. As 
designs are finalized, high-fidelity models and simulations that can more accurately predict system transient 
behavior under any condition are required. These models would allow the injection of failures typically experienced 
by the system and track their effects as they propagate throughout the system. The data generated would be used to 
develop diagnostic strategies to assess the state of the system. In addition, failure models would characterize system 
wear and fatigue as well as enable the prediction of failures, which would support on-base as well as high-fidelity
prognostic capabilities. High-fidelity models, however, typically do not run in real time and are inappropriate for use in
control system implementation. Models that support prognostics and are used to predict component life based on
current and historical data can provide a basis for future decisions. Such decisions can result in system
reconfiguration that may be performed several days in the future. These models may be more detailed and do not
necessarily need to run in real time; if the added complexity is warranted, they may be run back on earth in
supercomputing environments.
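
The use of low-fidelity models with parameter values bounded by extremes can be sketched as follows. The example evaluates a simple response metric at every corner of a parameter box and reports the smallest and largest results. The first-order lag model, the parameter names `gain` and `tau`, and the bounds are hypothetical, and corner evaluation brackets the response only when the metric is monotonic in each parameter over the box.

```python
import itertools
import math


def corner_envelope(bounds, metric):
    """Evaluate `metric` at every corner of the parameter box.

    bounds maps a parameter name to a (low, high) pair.
    Returns ((min value, params), (max value, params)).
    """
    names = sorted(bounds)
    results = []
    for corner in itertools.product(*(bounds[n] for n in names)):
        params = dict(zip(names, corner))
        results.append((metric(**params), params))
    lowest = min(results, key=lambda r: r[0])
    highest = max(results, key=lambda r: r[0])
    return lowest, highest


def temperature_rise(gain, tau, power_step=1.0, t=10.0):
    """Response of a first-order lag to a power step, evaluated at time t."""
    return gain * power_step * (1.0 - math.exp(-t / tau))


# Hypothetical uncertainty bounds on the model parameters.
bounds = {"gain": (1.0, 2.0), "tau": (5.0, 10.0)}
best_case, worst_case = corner_envelope(bounds, temperature_rise)
print("smallest rise:", best_case)
print("largest rise: ", worst_case)
```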

Development and validation systems (i.e., the infrastructure discussed previously) should employ a variety of
low-, medium-, and high-fidelity models, configured as simulation modules, to provide a means to rapidly determine the fitness
of a group of sensors or controller algorithms given appropriate performance measures. To this end, the existence, 
scope, and availability of existing simulation models for the major subsystems and components should be 
determined immediately, with these models leveraged if possible, and adapted for use in a common model 
development environment. These same modules should be easily transformed into models suitable for use in
model-based control and diagnostic systems, i.e., real-time embedded system applications.

The level of fidelity of on-base models will in part be dictated by the processing hardware chosen for the 
mission. Model uncertainty can be incorporated into controller design and can be determined by comparing the lower
fidelity models used with higher fidelity models or test data. Model fidelity and complexity are also determined by
the degree of autonomy desired, i.e., some component details may have a simplified representation in on-base models
if a more accurate representation is not essential to the corrective action taken. In the absence of high-fidelity
dynamic models, worst-case uncertainties and interactions among the subsystems should be assumed. The following guidelines
provide example subsystem model capabilities that would typically support autonomous system development: 

• The model should be capable of providing response characteristics for routine evolutions such as
startup-shutdown, load following, etc. These characteristics may be lifetime dependent (e.g., dependent on NHS fuel
burnup) for possible use in nonlinear gain-scheduled or model-predictive control schemes. 

• The model should be capable of simulating key high-bandwidth transients and faults. This may require the 
use of stiff-system, adaptive time step numerical integration techniques. These models may require order 
reduction and conversion to constant time step solvers for use in real-time, on-base control and diagnostic 
systems. 

• Systematic methods are needed for producing high-fidelity reduced-order models for on-base control and diagnostics.
Research currently underway in the jet propulsion industry involves deriving low-order models, suitable for 
use in on-board model predictive control schemes, from high-order, non-linear performance models. This 
expertise should be leveraged. 

• Component models of devices, whose malfunction may require system reconfiguration such as control drives, 
coolant pumps, valves, etc., should be of sufficient complexity to represent the specified fault. 

• Sensor models capable of providing appropriate bandwidth characteristics (e.g., accelerometers used for 
structural health and component monitoring) as well as malfunctions such as drift, bias etc., are also required. 

• Sensors specific to a component need to have modeling features that support any unique diagnostic 
information that the sensor may provide. For example, models of NHS SiC ex-core neutron sensors should 
accommodate neutron noise and power distribution irregularities. 

• The ability to locate sensors at random, physically acceptable locations in the system is essential for optimal 
sensor placement studies. 
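
As an illustration of the sensor-model guideline above, the following minimal sketch represents a sensor as a first-order lag (its bandwidth characteristic) with injectable bias and drift malfunctions. The class name, time constant, and fault magnitudes are hypothetical values chosen for illustration and are not taken from any specific instrument.

```python
import random


class SensorModel:
    """First-order-lag sensor with injectable bias and drift (illustrative only)."""

    def __init__(self, time_constant_s, dt_s, noise_std=0.0, seed=0):
        # Backward-Euler discretization of tau * dy/dt = u - y.
        self.alpha = dt_s / (time_constant_s + dt_s)
        self.dt_s = dt_s
        self.noise_std = noise_std
        self.bias = 0.0          # constant offset fault
        self.drift_rate = 0.0    # offset growth per second
        self._drift = 0.0
        self._state = None       # lagged value, initialized on first read
        self._rng = random.Random(seed)

    def read(self, true_value):
        """Advance one time step and return the (possibly faulted) reading."""
        if self._state is None:
            self._state = true_value
        self._state += self.alpha * (true_value - self._state)
        self._drift += self.drift_rate * self.dt_s
        noise = self._rng.gauss(0.0, self.noise_std) if self.noise_std > 0.0 else 0.0
        return self._state + self.bias + self._drift + noise


# Healthy sensor versus one with a slow drift fault (hypothetical numbers).
healthy = SensorModel(time_constant_s=2.0, dt_s=0.1)
drifting = SensorModel(time_constant_s=2.0, dt_s=0.1)
drifting.drift_rate = 0.5
healthy_trace = [healthy.read(900.0) for _ in range(10)]
drifting_trace = [drifting.read(900.0) for _ in range(10)]
```

Comparing the two traces gives a diagnostic algorithm a known fault to detect, which is the purpose of including malfunction modes in the sensor model.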

VII. Instrumentation Needs for Lunar Base Nuclear Power Systems 

Some of the more pressing challenges for sensor development are in the areas of long-life, radiation-hardened 
sensors and electronics. Sensors for terrestrial nuclear and aerospace applications (ref. 12), which have historically 
consisted of accelerometers for vibration, thermocouples and resistance detectors for temperature, venturis or orifice 
plates combined with differential pressure transmitters for flow, and ion chambers for NHS power measurements are 
presently not capable of meeting the requirements imposed by long-duration lunar base missions, e.g., 15-year 
maintenance-free continuous operation. NHS power monitors using gas-filled ion chambers require periodic 
maintenance, and will likely be replaced by solid-state detectors, such as the SiC-based detectors currently under 
investigation by the terrestrial nuclear industry (refs. 13 to 15). Other instruments operating in and in close 
proximity to the nuclear heat source must possess long-term tolerance to high temperatures as well as a mixed 
neutron-gamma field. The radiation fields experienced by the remainder of the equipment would have neutron dose 
rates significantly diminished due to primary (i.e., NHS) system shielding, with the dose rates consistent with the 
lunar background (ref. 3). Electronics utilizing SiC-based components are well suited for high temperature 
operation, which may result in reduced cooling requirements. In addition to harsh environment constraints, size and 
weight limitations as well as minimizing the number of penetrations in piping and the primary shield also impact the 
instruments (sensors and electronics) selected. 

Sensor redundancy would, to some degree, address issues associated with periodic maintenance: rather than 
removing and replacing a malfunctioning sensor (obviously not an option with the hypothetical mission presented), 
the redundant sensor would simply be switched into service in the same (approximate) location. However, since the redundant 
sensors would be located in close proximity to the original on-line sensor, they too would need to satisfy the same 
lifetime requirements with respect to the operating environment. Only issues associated with sensor operation, e.g., 
deterioration due to application of an electric field or unanticipated malfunctions, would be accommodated by 
redundancy. For space missions where launch-time payload size and weight are of concern (i.e., limited “real 
estate”), MEMS-based sensors would seem to provide a means for simultaneously satisfying the redundancy 
requirement and space limitations (refs. 16 and 17). 
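
One simple way to use a group of co-located redundant sensors is mid-value (median) selection, sketched below. This voting scheme is an assumption made for illustration and is not a method prescribed by the references; the function name and spread limit are hypothetical.

```python
from statistics import median


def select_redundant(readings, max_spread):
    """Vote among redundant sensors and list the indices of suspect channels.

    readings   -- one value per redundant channel (None = channel offline)
    max_spread -- largest deviation from the voted value still considered healthy
    """
    valid = [r for r in readings if r is not None]
    if not valid:
        raise ValueError("no redundant channel is available")
    voted = median(valid)
    suspects = [
        i for i, r in enumerate(readings)
        if r is None or abs(r - voted) > max_spread
    ]
    return voted, suspects


# Three co-located temperature channels; the third has failed high.
voted_value, suspect_channels = select_redundant([651.2, 650.8, 702.5], max_spread=5.0)
```

The median tolerates a single failed channel out of three, but, as noted above, it offers no protection against a common environmental cause that degrades all co-located channels together.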

To minimize the non-environmental dose received by the computing equipment, as much of the instrumentation 
as possible would be located at points distant from the NHS. Use of radiation-tolerant electronics would have the 
added benefit of reducing primary shield weight, thereby reducing the total spacecraft weight at launch. To 
minimize the number of instrument cables (and shielding penetrations) required, radiation-hardened multiplexers 
would be needed (ref. 9). The state of the art in radiation-hardened electronics relies on components constructed 
primarily of SiC (ref. 15). Further development in this area is needed to produce devices that achieve the lifetimes 
required. Piping penetrations could be minimized by advancement of carbide-compound, MEMS-based remote 
sensing (where sensors are implanted into the walls of the pipes (ref. 19)) and passive acoustic tomography 
technologies (ref. 20). Development of minimally-intrusive thin-film multifunction sensors to provide, for example, 
simultaneous measurements of strain, heat flux, and flow in a single package is an example of existing research that 
could be leveraged in this area (ref. 16). All of the sensors in the trade-space must also provide the bandwidth 
characteristics necessary for monitoring system transients and fault detection (e.g., accelerometers used for 
structural health and component monitoring). With the sensors required by each system determined and well- 
characterized, sensor placement studies may commence. 

In the power and process industries, sensors are often placed in the most convenient locations dictated by the 
system designers. These locations may not be optimal in terms of the quality of information the sensors could 
provide with regard to monitoring the overall health of the system. In the initial stages of system design there is 
usually some latitude with regard to sensor location. Some sensor locations may be considered optimal over others 
from the standpoint of detecting faults, and adding extra sensors in key locations may allow estimation of 
component efficiencies and provide the information necessary for analytical redundancy of some sensors. 
Analytical redundancy refers to the model-based estimation of an otherwise sensed parameter, such as temperature 
or pressure, for the purposes of verifying the information coming from the actual sensor. The estimation is typically 
based on measurements from sensors other than the one being verified. Often, information on parameters that 
cannot be directly measured (e.g., flow rates, temperatures, efficiencies) can be provided by a model-based 
analytical sensor, which provides an estimate of the parameter of interest (ref. 21). 
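
The idea of analytical redundancy can be sketched with a steady-state energy balance across a heated coolant channel. The balance, the function names, and all numerical values below are illustrative assumptions, not a model of any particular plant.

```python
def estimate_outlet_temp(t_in_k, power_w, mass_flow_kg_s, cp_j_kg_k):
    """Model-based estimate of outlet temperature from the other sensors."""
    return t_in_k + power_w / (mass_flow_kg_s * cp_j_kg_k)


def estimate_mass_flow(power_w, t_in_k, t_out_k, cp_j_kg_k):
    """Analytical sensor for a flow rate that is not measured directly."""
    return power_w / (cp_j_kg_k * (t_out_k - t_in_k))


def check_sensor(measured, estimated, tolerance):
    """Return the residual and whether the physical sensor agrees with the model."""
    residual = measured - estimated
    return residual, abs(residual) <= tolerance


# Hypothetical operating point: 100 kW into 2 kg/s of coolant, cp = 500 J/(kg K).
t_out_est = estimate_outlet_temp(900.0, 100e3, 2.0, 500.0)
residual, sensor_ok = check_sensor(measured=1040.0, estimated=t_out_est, tolerance=10.0)
```

A residual outside the tolerance indicates that either the verified sensor or one of the sensors feeding the model is in error; isolating which one requires additional, independent relationships.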

Systematic sensor selection (SSS) is an enabling technology for sensor placement currently applied to rocket 
engines (ref. 22). Figure 4 provides an overview of the SSS process. The process requires three categories of 
information. The first category consists of identifying targeted fault and sensing requirements. Risk reduction factors 
are associated with each fault to enable ranking of the faults in terms of the risk reduced by timely identification. 
The second category contains candidate sensor information which includes sensor type and location (a “sensor 
suite”) and estimated variance for normal operation. The third category defines fault scenarios that correspond to the 
targeted faults in terms of hardware-specific indications. Combined, the three categories effectively condense 
available engineering experience and process physics required for sensor selection. As shown in figure 4, a 
nonlinear (or linear) inverse process model provides a mapping from the trial sensor suite to the corresponding 
system states. The states calculated by the inverse model are compared to a set of “true” states (which may be based 
on test data or a high-fidelity model), with the difference used as an input to a quality measure (merit function). The 
corresponding merit values are used by a genetic algorithm that iteratively improves the quality measure by 
adjusting the contents of the sensor suite (sensor type or location). Speed and fidelity of fault detection, support for 
risk reduction, and fault source discrimination are examples of additional input quantities to the quality measure. 
Finally, a statistical evaluation algorithm is used to determine the probability that the fault identification produced 
by the optimal (or near-optimal) sensor suite might be confounded by control system-induced sensor variation or 
noise. A final assessment that assimilates all of the recommended sensor locations will determine the best overall 
sensor suite. Final sensor suite selection must be complete prior to the design and implementation of any control, 
diagnostic, or health management system. 
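
The genetic-algorithm portion of the selection loop can be sketched as follows. This is a simplified stand-in, not the SSS software of ref. 22: the inverse process model and state comparison are replaced by a risk-weighted detectability merit (a fault counts as covered when a selected sensor's expected shift exceeds a signal-to-noise threshold), and the final statistical evaluation step is omitted. All sensor names, signatures, risk factors, and tuning constants are hypothetical.

```python
import random

# Risk reduction factor assigned to timely identification of each targeted fault.
FAULT_RISK = {"pump_degradation": 5.0, "valve_stuck": 3.0, "hx_fouling": 2.0}

# Candidate sensors: (name, noise std in normal operation, expected shift per fault).
CANDIDATES = [
    ("flow_A", 0.5, {"pump_degradation": 4.0, "valve_stuck": 3.0}),
    ("flow_B", 0.5, {"pump_degradation": 3.5}),
    ("temp_hot", 1.0, {"hx_fouling": 6.0, "pump_degradation": 1.0}),
    ("temp_cold", 1.0, {"hx_fouling": 2.0}),
    ("press_in", 0.2, {"valve_stuck": 1.5}),
    ("vib_pump", 0.1, {"pump_degradation": 0.2}),
]
SNR_MIN = 3.0       # minimum shift-to-noise ratio for a fault to count as detectable
SENSOR_COST = 0.6   # merit penalty per installed sensor (mass, penetrations)


def merit(mask):
    """Quality measure of a trial sensor suite given as a tuple of 0/1 flags."""
    score = 0.0
    for fault, risk in FAULT_RISK.items():
        detectable = any(
            sel and abs(sig.get(fault, 0.0)) / std >= SNR_MIN
            for sel, (_, std, sig) in zip(mask, CANDIDATES)
        )
        if detectable:
            score += risk
    return score - SENSOR_COST * sum(mask)


def genetic_search(n_bits, generations=40, pop_size=16, p_mut=0.1, seed=1):
    """Elitist GA over sensor suites; returns the best suite and merit history."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=merit, reverse=True)
        history.append(merit(pop[0]))
        next_pop = pop[:2]  # elitism: carry the two best suites forward unchanged
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=merit)  # tournament selection
            b = max(rng.sample(pop, 3), key=merit)
            child = tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))
            child = tuple(1 - g if rng.random() < p_mut else g for g in child)
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=merit)
    history.append(merit(best))
    return best, history


best_mask, history = genetic_search(len(CANDIDATES))
best_suite = [name for sel, (name, _, _) in zip(best_mask, CANDIDATES) if sel]
```

Because the two best suites are always carried forward, the best merit value never decreases from one generation to the next. A realistic merit function would also include the terms mentioned above, such as speed of detection and fault source discrimination.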
Figure 4. — Overview of the systematic sensor selection process. 


VIII. The Role of Health Management in Autonomous Control 

Health management (HM) technologies are utilized and applied to systems where a “supervisor” understands and 
assimilates system information so that mission goals are protected and robust operation is ensured. The structures of 
these HM systems are composed of software algorithms that can be as diverse as the systems onto which they might 
be applied (refs. 23 to 25). There are, however, fundamental “building-blocks” in a basic architecture that any HM 
system can utilize, and this is illustrated in figure 5. 

The foundation of any well-designed HM system is a well-defined system and concept of operations. The 
failure modes and effects analysis (FMEA) and system requirements follow naturally from these two elements and 
form the basis for the HM system; without them, the HM system cannot be adequately 
designed. In addition, high risk failure modes that the HM system must be able to detect 
and isolate will need to be identified. Optimal selection of sensors and their locations within the system will enable the greatest potential for 
successful health diagnosis and prognosis while at the same time maximizing risk reduction for the system as a 
whole. The diagnostic and prognostic capabilities of the HM system consist of software algorithms that can 
perform data validation, fault detection, fault isolation, and information fusion; this ultimately provides information 
for any recommended control actions, maintenance plans, and refinement of system operations. In the end, the HM 
system needs to protect the system it was designed to manage in an efficient, robust, and timely manner. 

The need for HM in autonomous lunar base power system operation is illustrated via a subsystem common to 
virtually all nuclear systems currently in the trade space, the power management and distribution system. Advanced 
PMAD technologies will enable the future success of long duration space-flight missions, new launch vehicles, 
operation of ground- or space-based stations, and deployment of satellites (refs. 26 and 27). Illustrated in figure 6 are 
some PMAD components and their relationship to others within the power subsystem. 
Figure 5. — Systematic health management development. 


The function of the PMAD subsystem has a direct influence on the operation of the other subsystems. For 
example, with an unhealthy PMAD, there could be an interruption in power to the rest of the system, which might 
disable the communications system or the electronics for the control systems. Therefore, the detection and isolation 
of a fault in this subsystem is important to the overall function of the entire system. In other words, the “health” of 
the PMAD subsystem is inherently important to the overall “health” of the entire system. 

Fault modes in the PMAD subsystem manifest themselves in many ways. A problem due to an electrical 
degradation might be exposed as a failure or declining performance in a component. Likewise, arc, leakage, and 
corona are “hidden” system electrical faults that are hard to detect. A health management system for the PMAD 
subsystem will need to detect and isolate faults like the ones listed above and recommend mitigating actions so that 
they do not become critical. 

Current research on modular electrical power systems indicates that reliability and safety can be improved, while at 
the same time lowering development and operational costs, by using unique distribution topologies and 
“modularizing” the power system to provide higher degrees of flexibility in the PMAD. However, this increases 
complexity and indicates the need for a health management system that can analyze the power system data, assess 
system health, and optimally reconfigure the power system autonomously. In addition, this will reduce the 
number of ground support personnel and support systems needed over the life of a mission. The integration of 
PMAD health management technology will greatly enhance the reliability and safety of the power systems that 
will be required in any Office of Exploration Systems program. 
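
A very simple form of autonomous reconfiguration is priority-based load shedding when the available power drops, sketched below. The greedy rule is an assumption used for illustration and is not an optimal reconfiguration strategy; the load names, demands, and priorities are hypothetical.

```python
def shed_loads(loads, available_w):
    """Greedy reconfiguration: power loads in priority order until capacity runs out.

    loads -- list of (name, demand_w, priority); a lower priority number is more critical
    Returns (powered names, shed names, unused capacity in watts).
    """
    powered, shed = [], []
    remaining = available_w
    for name, demand_w, _priority in sorted(loads, key=lambda load: load[2]):
        if demand_w <= remaining:
            powered.append(name)
            remaining -= demand_w
        else:
            shed.append(name)
    return powered, shed, remaining


base_loads = [
    ("life_support", 3000, 0),
    ("comms", 500, 1),
    ("science", 2500, 2),
    ("rover_charging", 1500, 3),
]
# Capacity reduced to 5.5 kW after a hypothetical PMAD fault is isolated.
powered, shed, margin_w = shed_loads(base_loads, available_w=5500)
```

A health management system would trigger such a reconfiguration only after the fault has been detected and isolated, and a practical scheme would also account for distribution topology and switching constraints.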
IX. Leveraging Past Experience in Autonomous Control System Development 

Redevelopment of technology is costly and time consuming. Tight schedules and limited financial resources 
warrant adaptation of existing techniques whenever possible. Significant progress has been made in intelligent and 
autonomous system control within the aerospace and power industries (refs. 8, 24, 28, and 29). Many of these 
control system designs can draw upon techniques developed for jet and rocket engines. While this research may 
address a fundamentally different phenomenon, the knowledge gained and solutions developed may provide insight 
into control of the power system component (or subsystem) considered. 

Current research into engine life extending control techniques (ref. 30) (such as determination of optimal 
acceleration schedules to minimize component deterioration) may be an example of an existing technology that can 
be directly leveraged for components such as the Brayton power conversion system, which are similar in dynamic 
system characteristics to gas turbine-based turbofan engines. The following provides some specific examples of 
state-of-the-art technologies that can be applied to the hypothetical space nuclear system described in section II. 

Robust MIMO Control System Design Strategies. — Robust MIMO (Multiple Input Multiple Output) control 
system design strategies (e.g., H∞, optimal control) have been used extensively in the process and active vibration 
control areas for over a decade (refs. 31 and 32). More recently, these techniques have been successfully applied to 
research NHS (refs. 33 and 34). Opportunities exist to apply supervisory systems (e.g., based on neural network or 
fuzzy logic architectures) to coordinate the efforts of MIMO robust NHS and energy conversion plant subsystem 
controllers. MIMO robust control techniques require quantification of plant uncertainties, which are then 
incorporated into the controller design to accommodate unknown or un-modeled plant dynamics. Plant uncertainty is 
incorporated via transfer functions, determined experimentally or via appropriate low order linear models of higher 
order non-linear models. As system designs (and simulation models) mature, Model Predictive Control techniques, 
which employ a faster-than-real-time model to determine optimal actuator commands given a set of constraints, 
could also be implemented. These on-base models could be periodically updated and loaded into the control system 
processor using data acquired during operation. 
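
The receding-horizon idea behind Model Predictive Control can be sketched with a scalar plant and an exhaustive search over a small set of actuator moves. The first-order plant, its coefficients, the constraint values, and the brute-force search are all assumptions chosen to keep the example short; a flight implementation would use a proper constrained optimizer and a validated plant model.

```python
from itertools import product

A, B = 0.9, 0.1  # hypothetical discrete plant: x[k+1] = A * x[k] + B * u[k]


def mpc_step(x, u_prev, ref, horizon=4, du_max=0.5, u_min=0.0, u_max=10.0,
             move_weight=0.01):
    """Return the next actuator command from a short look-ahead search.

    Every sequence of moves in {-du_max, 0, +du_max} is simulated with the
    internal model; sequences that violate the actuator limits are discarded,
    and the first command of the cheapest feasible sequence is applied.
    """
    best_cost, best_u = float("inf"), u_prev
    for moves in product((-du_max, 0.0, du_max), repeat=horizon):
        xk, uk, cost, feasible = x, u_prev, 0.0, True
        for du in moves:
            uk += du
            if not (u_min <= uk <= u_max):
                feasible = False
                break
            xk = A * xk + B * uk  # faster-than-real-time model prediction
            cost += (xk - ref) ** 2 + move_weight * du ** 2
        if feasible and cost < best_cost:
            best_cost, best_u = cost, u_prev + moves[0]
    return best_u


# Closed-loop simulation: drive the state toward a setpoint of 5.0.
x, u = 0.0, 0.0
inputs = []
for _ in range(30):
    u = mpc_step(x, u, ref=5.0)
    x = A * x + B * u
    inputs.append(u)
```

Only the first move of each optimized sequence is applied before the search is repeated, which is what allows the controller to respect rate and magnitude limits while reacting to new measurements at every step.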

Sensor Fusion Techniques. — Sensor fusion is a method used to reduce the uncertainty associated with the 
monitoring of specific operating parameters and identification of component malfunctions (ref. 35). Fusing 
information from different sources is currently used to enhance the reliability of aircraft engines (refs. 21 and 36) 
and could be applied to NHS/energy conversion systems using, for example, neutron noise, temperature, pressure, 
and vibration sensor signals. In the absence of measured parameters, model-based estimation techniques (e.g., 
Kalman filtering (ref. 35)) could be employed to provide best-estimate analytical sensors for key nuclear heat source 
core/power conversion system parameters, such as dynamic power distributions and compressor (turbine) 
efficiencies, in real-time. 
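
A minimal sketch of Kalman-filter-based fusion is given below for a single slowly varying parameter observed by several sensors of differing quality. The scalar random-walk model and all numerical values are assumptions for illustration; estimating quantities such as power distributions would require a multivariable plant model.

```python
def kalman_fuse(x_est, p_est, measurements, process_var):
    """One predict/update cycle of a scalar random-walk Kalman filter.

    x_est, p_est -- previous estimate and its variance
    measurements -- list of (value, variance) pairs from independent sensors
    process_var  -- variance added per step to allow the parameter to change
    """
    x, p = x_est, p_est + process_var      # predict: the parameter may have moved
    for z, r in measurements:              # update sequentially with each sensor
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
    return x, p


# Fuse a precise and a noisy temperature channel (hypothetical values).
x, p = 900.0, 25.0
x, p = kalman_fuse(x, p, [(905.0, 4.0), (903.0, 9.0)], process_var=1.0)
```

Each additional measurement reduces the variance of the fused estimate, which is the sense in which fusion reduces the uncertainty of the monitored parameter.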

Fault Detection Isolation (FDI). — Fault Detection/Isolation (FDI) algorithm development requires defining 
fault scenarios and corresponding plant parameter feature sets used to identify anticipated faults (refs. 37 and 38). 

For components in the final stages of design and testing, Failure Mode and Effects Criticality Analyses (FMECA) 
(ref. 10) may be used to define fault features. Data-based techniques, relying on neural network and fuzzy logic 
architectures for fault classification (ref. 28), as well as model-based techniques, such as the Kalman filter estimator 
(ref. 39), are frequently used in FDI systems. The output of the FDI system can in turn be used by a fault 
accommodating reconfigurable control scheme; this, of course, would require the FDI system to operate in real time. 
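
The mapping from fault scenarios to parameter feature sets can be sketched as a signature table matched against thresholded residuals (measurement minus model estimate). The fault names, signatures, and thresholds below are hypothetical and would in practice come from an FMECA and from plant models or test data.

```python
# Expected residual direction per fault: +1 high, -1 low, 0 unaffected.
FAULT_SIGNATURES = {
    "coolant_pump_degraded": {"flow": -1, "core_dT": +1, "pump_current": +1},
    "flow_sensor_bias": {"flow": -1, "core_dT": 0, "pump_current": 0},
    "heat_source_overpower": {"flow": 0, "core_dT": +1, "pump_current": 0},
}


def discretize(residual, threshold):
    """Map a residual to -1, 0, or +1 using a symmetric dead band."""
    if residual > threshold:
        return 1
    if residual < -threshold:
        return -1
    return 0


def detect_and_isolate(residuals, thresholds):
    """Return (fault_detected, list of fault names whose signature matches)."""
    symptoms = {k: discretize(residuals[k], thresholds[k]) for k in residuals}
    if all(v == 0 for v in symptoms.values()):
        return False, []
    matches = [name for name, sig in FAULT_SIGNATURES.items() if sig == symptoms]
    return True, matches


thresholds = {"flow": 0.2, "core_dT": 5.0, "pump_current": 0.5}
detected, candidates = detect_and_isolate(
    {"flow": -0.6, "core_dT": 12.0, "pump_current": 1.1}, thresholds
)
```

A detected fault with an empty match list indicates a symptom pattern outside the anticipated fault set, which is itself useful information for a supervisory system.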

Reliability Enhancing/Life Extending Control. — One of the more mature energy conversion technologies in the 
trade-space, i.e., that based on the closed Brayton cycle (CBC), may have issues associated with its reliability, 
primarily due to the need for moving parts, bearings, etc. The higher efficiencies of the CBC translate into lower 
material temperatures throughout the system with correspondingly longer component lifetimes. One focus area may 
be developing control and diagnostic systems that enhance the operational reliability of CBC-based energy 
conversion systems. As mentioned previously, life extending control utilizes optimally-calculated system trajectories 
to minimize cycling, temperature excursions, etc., to extend component lifetimes, with system uncertainty 
accommodated using robust control methods documented in the literature (refs. 30 and 40). 

Health Management Diagnostic Inference Engines. — Model-based diagnosis allows the development of 
diagnostic systems that are based on physical models rather than sets of rules alone. Model-based diagnosis is more 
flexible and, in general, employs models similar to the simulation models used for controller development. 
Livingstone is an example of a model-based software tool that can be used for fault detection, isolation, and 
recovery and is responsible for inferring the health of the system on which it has been applied. It does this by 
utilizing a qualitative model and discretized sensor and event information. As event data are received, Livingstone 
continually updates its understanding of the state of the various components in the system. The model is used to 
determine the expected observations given the component state. When there is a discrepancy between the expected 
observations and the actual observations, Livingstone searches for the most likely set of component states/failures 
that could produce the observation. In addition, it can also generate recommended recovery actions. Some 
applications where Livingstone has been utilized as a diagnostic engine (ref. 41) are: DS-1 Remote Agent 
Experiment, Space Shuttle Main Engine, Command and Data Handling System of the International Space Station, 
PITEX X-34 main propulsion system demonstration, and NASA’s Earth Observing One satellite. 
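
The style of reasoning described above can be illustrated with a small consistency-based diagnosis sketch. This is not Livingstone's actual interface or algorithm: it enumerates all component mode assignments rather than performing a guided search, and the components, modes, prior probabilities, and qualitative model are hypothetical.

```python
from itertools import product

# Component modes with assumed prior probabilities.
MODES = {
    "pump": {"ok": 0.99, "failed": 0.01},
    "valve": {"ok": 0.995, "stuck_closed": 0.005},
    "flow_sensor": {"ok": 0.98, "stuck_at_zero": 0.02},
}


def predict(state):
    """Qualitative model: expected discretized observations for a component state."""
    flow = "nominal" if state["pump"] == "ok" and state["valve"] == "ok" else "zero"
    return {
        "pump_current": "nominal" if state["pump"] == "ok" else "zero",
        "flow_reading": flow if state["flow_sensor"] == "ok" else "zero",
    }


def diagnose(observations):
    """Return consistent component states, most likely first, as (prior, state)."""
    names = list(MODES)
    candidates = []
    for modes in product(*(MODES[n] for n in names)):
        state = dict(zip(names, modes))
        if predict(state) == observations:
            prior = 1.0
            for n in names:
                prior *= MODES[n][state[n]]
            candidates.append((prior, state))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Flow reads zero while the pump still draws nominal current.
ranked = diagnose({"pump_current": "nominal", "flow_reading": "zero"})
most_likely = ranked[0][1]
```

With these assumed priors, a stuck sensor explains the discrepancy more plausibly than a stuck valve, and the ranked list shows which alternative explanations remain consistent with the observations.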

Infrastructure development. — Many of the techniques mentioned above may, to some degree, be taken “off-the-shelf” 
and applied to lunar surface nuclear power systems with relative ease (albeit used in other, seemingly 
unrelated industries and applications). However, the development infrastructure needs to be in place up front to 
efficiently and prudently do so. For example, component models, rapid prototyping environments, hardware-in-the-loop 
test beds, etc., which provide a means for rapidly assessing and validating the capabilities of candidate algorithms, 
need to be established as early as possible. Commercial off-the-shelf (COTS) software products such as MATLAB 
and SIMULINK (ref. 42) (for dynamic model and controller development), Impact Technologies PHM Design 
(ref. 43) (for diagnostic system development), and dSPACE (ref. 44) (for hardware-in-the-loop testing), which can be 
seamlessly integrated for controller development and testing, should be used whenever possible. 
Additionally, software already developed for government research projects can also be utilized if it is mature 
enough to be integrated. As mentioned previously, NASA Glenn has developed software to perform sensor selection 
and test data validation for rocket engines (ref. 22). Oak Ridge National Laboratory has developed a system for 
prototyping terrestrial nuclear system controller algorithms (ref. 45). Both of these example platforms, however, are 
only effective when used with accurate dynamic models that provide the essential system characteristics as 
described in section VI. 


X. Conclusions 

This paper has presented an overview of the enabling control technologies necessary for the development of a 
reliable, autonomous lunar base nuclear power system. Optimal sensor placement and a hierarchical, supervisory 
control architecture would allow different levels of autonomy at various system levels. High fidelity system, 
subsystem and component models would allow for diagnostic and prognostic applications as well as fault 
accommodating control. For the lunar environment, long-life, radiation-hardened sensors and electronics would be 
required. Health management technologies, if employed, would ensure that mission goals are successfully achieved. 
Each technology presents a significant research effort in its own right and therefore needs to be addressed prior to 
model development, in parallel with the system design in order to have a positive impact on the integrated system. 
Such an approach would enable the identification of control related as well as integrated system related problem 
areas early in the development, avoiding potentially expensive “work-arounds” by making necessary changes in the 
design prior to development. 